Automatic language identification
Authors
Abstract
Automatic language identification is the process by which the language of a digitized speech utterance is recognized by a computer. In this paper, we will describe the set of available cues for language identification and discuss the different approaches to building working systems. This overview includes a range of historic approaches, contemporary systems that have been evaluated on standard databases, as well as promising future approaches. Comparative results are also reported.
Similar resources
Incorporating linguistic knowledge into automatic dialect identification of Spanish
Automatic dialect identification, like automatic language identification, has often been approached through the use of phonetic frequencies and phonetic sequence modeling. While such statistical systems perform well on language identification problems, they are less adept at the more difficult problem of automatic dialect identification, particularly on short segments of speech. In this paper ...
Kohonen Self Organizing for Automatic Identification of Cartographic Objects
Automatic identification and localization of cartographic objects in aerial and satellite images have gained increasing attention in recent years in digital photogrammetry and remote sensing. Although the automatic extraction of man made objects in essence is still an unresolved issue, the man made objects can be extracted from aerial photos and satellite images. Recently, the high-resolution s...
Text-Based Automatic Language Identification
We present a statistical approach to text-based automatic language identification that focuses on discrimination between, as opposed to representation of, different language models. The system is evaluated on a text corpus containing six African and six European languages.
Automatic identification of language varieties: The case of Portuguese
Automatic Language Identification of written texts is a well-established area of research in Computational Linguistics. State-of-the-art algorithms often rely on n-gram character models to identify the correct language of texts, with good results seen for European languages. In this paper we propose the use of a character n-gram model and a word n-gram language model for the automatic classifica...
An approach to automatic figurative language detection: A pilot study
This pilot study explores a new approach to automatic detection of figurative language. Our working hypothesis is that the problem of automatic identification of idioms (and metaphors, to some extent) can be reduced to the problem of identifying an outlier in a dataset. By an outlier we mean an observation which appears to be inconsistent with the remainder of a set of data.
From perceptual designs to linguistic typology and automatic language identification: overview and perspectives
This paper gives an overview of the methods in perceptual language identification and suggests a new approach based on a two-step methodology that integrates “genetic” considerations into perception and results in the modeling of perceptually identified discriminative cues. The first study reported here concerns experimental designs for perceptual and automatic identification of th...
Journal title:
- Speech Communication
Volume 35, Issue
Pages -
Publication date 2001